64 research outputs found

    Neural plasma

    Get PDF
    This paper presents a novel type of artificial neural network, called neural plasma, which is tailored for classification tasks involving few observations with a large number of variables. Neural plasma learns to adapt its classification confidence by generating artificial training data as a function of its confidence in previous decisions. In contrast to multilayer perceptrons and similar techniques, which are inspired by topological and operational aspects of biological neural networks, neural plasma is motivated by aspects of high-level behavior and reasoning in the presence of uncertainty. The basic principles of the proposed model apply to other supervised learning algorithms that provide explicit classification confidence values. The empirical evaluation of this new technique is based on benchmarking experiments involving data sets from biotechnology that are characterized by the small-n-large-p problem. The presented study exposes a comprehensive methodology and is seen as a first step in exploring different aspects of this methodology.IFIP International Conference on Artificial Intelligence in Theory and Practice - Neural NetsRed de Universidades con Carreras en Informática (RedUNCI

    Instance-based concept learning from multiclass DNA microarray data

    Get PDF
    BACKGROUND: Various statistical and machine learning methods have been successfully applied to the classification of DNA microarray data. Simple instance-based classifiers such as nearest neighbor (NN) approaches perform remarkably well in comparison to more complex models, and are currently experiencing a renaissance in the analysis of data sets from biology and biotechnology. While binary classification of microarray data has been extensively investigated, studies involving multiclass data are rare. The question remains open whether there exists a significant difference in performance between NN approaches and more complex multiclass methods. Comparative studies in this field commonly assess different models based on their classification accuracy only; however, this approach lacks the rigor needed to draw reliable conclusions and is inadequate for testing the null hypothesis of equal performance. Comparing novel classification models to existing approaches requires focusing on the significance of differences in performance. RESULTS: We investigated the performance of instance-based classifiers, including a NN classifier able to assign a degree of class membership to each sample. This model alleviates a major problem of conventional instance-based learners, namely the lack of confidence values for predictions. The model translates the distances to the nearest neighbors into 'confidence scores'; the higher the confidence score, the closer is the considered instance to a pre-defined class. We applied the models to three real gene expression data sets and compared them with state-of-the-art methods for classifying microarray data of multiple classes, assessing performance using a statistical significance test that took into account the data resampling strategy. Simple NN classifiers performed as well as, or significantly better than, their more intricate competitors. CONCLUSION: Given its highly intuitive underlying principles – simplicity, ease-of-use, and robustness – the k-NN classifier complemented by a suitable distance-weighting regime constitutes an excellent alternative to more complex models for multiclass microarray data sets. Instance-based classifiers using weighted distances are not limited to microarray data sets, but are likely to perform competitively in classifications of high-dimensional biological data sets such as those generated by high-throughput mass spectrometry

    Comparative study of three commonly used continuous deterministic methods for modeling gene regulation networks

    Get PDF
    BACKGROUND: A gene-regulatory network (GRN) refers to DNA segments that interact through their RNA and protein products and thereby govern the rates at which genes are transcribed. Creating accurate dynamic models of GRNs is gaining importance in biomedical research and development. To improve our understanding of continuous deterministic modeling methods employed to construct dynamic GRN models, we have carried out a comprehensive comparative study of three commonly used systems of ordinary differential equations: The S-system (SS), artificial neural networks (ANNs), and the general rate law of transcription (GRLOT) method. These were thoroughly evaluated in terms of their ability to replicate the reference models' regulatory structure and dynamic gene expression behavior under varying conditions. RESULTS: While the ANN and GRLOT methods appeared to produce robust models even when the model parameters deviated considerably from those of the reference models, SS-based models exhibited a notable loss of performance even when the parameters of the reverse-engineered models corresponded closely to those of the reference models: this is due to the high number of power terms in the SS-method, and the manner in which they are combined. In cross-method reverse-engineering experiments the different characteristics, biases and idiosynchracies of the methods were revealed. Based on limited training data, with only one experimental condition, all methods produced dynamic models that were able to reproduce the training data accurately. However, an accurate reproduction of regulatory network features was only possible with training data originating from multiple experiments under varying conditions. CONCLUSIONS: The studied GRN modeling methods produced dynamic GRN models exhibiting marked differences in their ability to replicate the reference models' structure and behavior. Our results suggest that care should be taking when a method is chosen for a particular application. In particular, reliance on only a single method might unduly bias the results

    Reverse-engineering of gene regulation models from multi-condition experiments

    Get PDF

    Towards creative information exploration based on Koestler's concept of bisociation

    Get PDF
    Creative information exploration refers to a novel framework for exploring large volumes of heterogeneous information. In particular, creative information exploration seeks to discover new, surprising and valuable relationships in data that would not be revealed by conventional information retrieval, data mining and data analysis technologies. While our approach is inspired by work in the field of computational creativity, we are particularly interested in a model of creativity proposed by Arthur Koestler in the 1960s. Koestler’s model of creativity rests on the concept of bisociation. Bisociative thinking occurs when a problem, idea, event or situation is perceived simultaneously in two or more “matrices of thought” or domains. When two matrices of thought interact with each other, the result is either their fusion in a novel intellectual synthesis or their confrontation in a new aesthetic experience. This article discusses some of the foundational issues of computational creativity and bisociation in the context of creative information exploration
    corecore